Multi-Document Discourse Parsing Using Traditional and Hierarchical Machine Learning
Abstract
Handling multiple documents is essential today, when many documents on the same topic are produced, especially on the Web. Both readers and computer applications can benefit from a discourse analysis of this multi-document content, since it makes explicit the relations among portions of these documents. This work aims to identify such relations automatically using machine learning techniques. In particular, it focuses on identifying the relations predicted by the Cross-document Structure Theory (CST). The results obtained improve on the state of the art.
Similar resources
Machine Learning Approaches to Shallow Discourse Parsing: A Literature Review
This document reviews the literature on shallow discourse parsing, in particular the use of machine learning techniques. This is deliverable Y1.M6 of the Discourse Parsing White Paper which is part of the MDM IP of the IM2 project.
Discourse Parsing with Attention-based Hierarchical Neural Networks
RST-style document-level discourse parsing remains a difficult task and efficient deep learning models on this task have rarely been presented. In this paper, we propose an attention-based hierarchical neural network model for discourse parsing. We also incorporate tensor-based transformation function to model complicated feature interactions. Experimental results show that our approach obtains...
Unsupervised Learning for Natural Language Processing
Given the abundance of text data, unsupervised approaches are very appealing for natural language processing. We present three latent variable systems which achieve state-of-the-art results in domains previously dominated by fully supervised systems. For syntactic parsing, we describe a grammar induction technique which begins with coarse syntactic structures and iteratively refines them in an ...
Hybrid Approach to PDTB-styled Discourse Parsing for CoNLL-2015
This paper describes our end-to-end PDTB-styled discourse parser for the CoNLL-2015 shared task. We employed a machine learning-based approach to identify discourse relation between text spans for both explicit and implicit relations and employed a rule-based approach to extract arguments of the discourse relations. In particular, we focus on improving the implicit discourse relation identifica...
CSTParser – a multi-document discourse parser
This paper presents the CSTParser, a multi-document discourse parser. Based on machine learning techniques and hand-crafted rules, the system identifies a set of relations predicted by CST (Cross-document Structure Theory) among sentences of different texts on the same topic.
Publication date: 2011